Results 1 - 20 of 52
1.
J Math Biol ; 88(3): 29, 2024 Feb 19.
Article in English | MEDLINE | ID: mdl-38372830

ABSTRACT

Reticulations in a phylogenetic network represent processes such as gene flow, admixture, recombination and hybrid speciation. Extending definitions from the tree setting, an anomalous network is one in which some unrooted tree topology displayed in the network appears in gene trees with a lower frequency than a tree not displayed in the network. We investigate anomalous networks under the Network Multispecies Coalescent Model with possible correlated inheritance at reticulations. Focusing on subsets of 4 taxa, we describe a new algorithm to calculate quartet concordance factors on networks of any level, faster than previous algorithms because of its focus on 4 taxa. We then study topological properties required for a 4-taxon network to be anomalous, uncovering the key role of $3_2$-cycles: cycles of 3 edges parent to a sister group of 2 taxa. Under the model of common inheritance, that is, when each gene tree coalesces within a species tree displayed in the network, we prove that 4-taxon networks are never anomalous. Under independent and various levels of correlated inheritance, we use simulations under realistic parameters to quantify the prevalence of anomalous 4-taxon networks, finding that truly anomalous networks are rare. At the same time, however, we find a significant fraction of networks close enough to the anomaly zone to appear anomalous, when considering the quartet concordance factors observed from a few hundred genes. These apparent anomalies may challenge network inference methods.


Subjects
Algorithms, Prevalence, Phylogeny
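
Illustrative note on record 1 (not part of the database entry): the simplest instance of a quartet concordance factor is a 4-taxon species tree with no reticulation. Under the multispecies coalescent, the quartet displayed in the tree has expected frequency 1 - (2/3)e^(-t), and each of the two other quartets has (1/3)e^(-t), where t is the internal branch length in coalescent units. The sketch below covers only this tree case. The function name is hypothetical, and this is not the network algorithm described in the abstract.

```python
import math


def quartet_cfs_on_tree(t):
    """Expected quartet concordance factors on a 4-taxon species TREE.

    t: internal branch length in coalescent units (t >= 0).
    Returns (major, minor, minor): the expected frequency of the quartet
    displayed in the tree, then of each of the two other quartets.
    Hypothetical helper for illustration; networks need more than this.
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    minor = math.exp(-t) / 3.0   # each non-displayed quartet
    major = 1.0 - 2.0 * minor    # the quartet displayed in the tree
    return major, minor, minor


if __name__ == "__main__":
    for t in (0.0, 0.5, 2.0):
        print(t, quartet_cfs_on_tree(t))
```

With t = 0 the three quartets are equally frequent, and the displayed quartet dominates as t grows. On a tree the displayed quartet is never less frequent than the others, which is why anomalies on 4 taxa can only arise from reticulations.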
2.
Nat Commun ; 14(1): 7173, 2023 Nov 07.
Article in English | MEDLINE | ID: mdl-37935674

ABSTRACT

Tradeoffs between the energetic benefits and costs of traits can shape species and trait distributions along environmental gradients. Here we test predictions based on such tradeoffs using survival, growth, and 50 photosynthetic, hydraulic, and allocational traits of ten Eucalyptus species grown in four common gardens along an 8-fold gradient in precipitation/pan evaporation (P/Ep) in Victoria, Australia. Phylogenetically structured tests show that most trait-environment relationships accord qualitatively with theory. Most traits appear adaptive across species within gardens (indicating fixed genetic differences) and within species across gardens (indicating plasticity). However, species from moister climates have lower stomatal conductance than others grown under the same conditions. Responses in stomatal conductance and five related traits appear to reflect greater mesophyll photosynthetic sensitivity of mesic species to lower leaf water potential. Our data support adaptive cross-over, with realized height growth of most species exceeding that of others in climates they dominate. Our findings show that pervasive physiological, hydraulic, and allocational adaptations shape the distributions of dominant Eucalyptus species along a subcontinental climatic moisture gradient, driven by rapid divergence in species P/Ep and associated adaptations.


Subjects
Eucalyptus, Trees, Trees/physiology, Plant Leaves/physiology, Climate, Photosynthesis, Water, Eucalyptus/physiology, Victoria
3.
bioRxiv ; 2023 Aug 21.
Article in English | MEDLINE | ID: mdl-37662314

ABSTRACT

Reticulations in a phylogenetic network represent processes such as gene flow, admixture, recombination and hybrid speciation. Extending definitions from the tree setting, an anomalous network is one in which some unrooted tree topology displayed in the network appears in gene trees with a lower frequency than a tree not displayed in the network. We investigate anomalous networks under the Network Multispecies Coalescent Model with possible correlated inheritance at reticulations. Focusing on subsets of 4 taxa, we describe a new algorithm to calculate quartet concordance factors on networks of any level, faster than previous algorithms because of its focus on 4 taxa. We then study topological properties required for a 4-taxon network to be anomalous, uncovering the key role of $3_2$-cycles: cycles of 3 edges parent to a sister group of 2 taxa. Under the model of common inheritance, that is, when each gene tree coalesces within a species tree displayed in the network, we prove that 4-taxon networks are never anomalous. Under independent and various levels of correlated inheritance, we use simulations under realistic parameters to quantify the prevalence of anomalous 4-taxon networks, finding that truly anomalous networks are rare. At the same time, however, we find a significant fraction of networks close enough to the anomaly zone to appear anomalous, when considering the quartet concordance factors observed from a few hundred genes. These apparent anomalies may challenge network inference methods.

4.
Syst Biol ; 2023 Sep 12.
Article in English | MEDLINE | ID: mdl-37698548

ABSTRACT

The evolutionary implications and frequency of hybridization and introgression are increasingly being recognized across the tree of life. To detect hybridization from multi-locus and genome-wide sequence data, a popular class of methods is based on summary statistics from subsets of 3 or 4 taxa. However, these methods often carry the assumption of a constant substitution rate across lineages and genes, which is commonly violated in many groups. In this work, we quantify the effects of rate variation on the D test (also known as ABBA-BABA test), the D3 test, and HyDe. All three tests are used widely across a range of taxonomic groups, in part because they are very fast to compute. We consider rate variation across species lineages, across genes, their lineage-by-gene interaction, and rate variation across gene-tree edges. We simulated species networks according to a birth-death-hybridization process so as to capture a range of realistic species phylogenies. For all three methods tested, we found a marked increase in the false discovery of reticulation (type-1 error rate) when there is rate variation across species lineages. The D3 test was the most sensitive, with around 80% type-1 error, such that D3 appears to be more sensitive to a departure from the clock than to the presence of reticulation. For all three tests, the power to detect hybridization events decreased as the number of hybridization events increased, indicating that multiple hybridization events can obscure one another if they occur within a small subset of taxa. Our study highlights the need to consider rate variation when using site-based summary statistics, and points to the advantages of methods that do not require assumptions on evolutionary rates across lineages or across genes.
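
Illustrative note on record 4 (not part of the database entry): the D statistic mentioned in the abstract compares counts of two site patterns among three ingroup taxa P1, P2, P3 and an outgroup O, as D = (nABBA - nBABA) / (nABBA + nBABA). The sketch below is a minimal version assuming four aligned sequences of equal length; the function name and the toy sequences are hypothetical, and no significance test (for example a block jackknife) is included.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Count ABBA and BABA site patterns and return (n_abba, n_baba, D).

    Assumes the four sequences are aligned and of equal length.
    Only biallelic sites made of unambiguous nucleotides are counted.
    The outgroup state is treated as ancestral ('A'), the other as derived ('B').
    """
    if not (len(p1) == len(p2) == len(p3) == len(outgroup)):
        raise ValueError("sequences must have the same length")
    valid = set("ACGT")
    n_abba = n_baba = 0
    for a, b, c, o in zip(p1.upper(), p2.upper(), p3.upper(), outgroup.upper()):
        states = {a, b, c, o}
        if len(states) != 2 or not states <= valid:
            continue  # skip invariant, multi-allelic, gapped or ambiguous sites
        if a == o and b == c:
            n_abba += 1  # P2 and P3 share the derived state
        elif b == o and a == c:
            n_baba += 1  # P1 and P3 share the derived state
    total = n_abba + n_baba
    d = (n_abba - n_baba) / total if total else 0.0
    return n_abba, n_baba, d


if __name__ == "__main__":
    # Toy alignment: two ABBA sites, one BABA site, two uninformative sites.
    print(d_statistic("AGTAC", "CAGAC", "CGGAT", "AATAT"))
```

Because the statistic uses only pattern counts, it is fast to compute. It also carries no correction for unequal substitution rates between lineages, the situation whose effect on the test the abstract quantifies.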

5.
Syst Biol ; 72(5): 1171-1179, 2023 Nov 01.
Article in English | MEDLINE | ID: mdl-37254872

ABSTRACT

We consider the evolution of phylogenetic gene trees along phylogenetic species networks, according to the network multispecies coalescent process, and introduce a new network coalescent model with correlated inheritance of gene flow. This model generalizes two traditional versions of the network coalescent: with independent or common inheritance. At each reticulation, multiple lineages of a given locus are inherited from parental populations chosen at random, either independently across lineages or with positive correlation according to a Dirichlet process. This process may account for locus-specific probabilities of inheritance, for example. We implemented the simulation of gene trees under these network coalescent models in the Julia package PhyloCoalSimulations, which depends on PhyloNetworks and its powerful network manipulation tools. Input species phylogenies can be read in extended Newick format, either in numbers of generations or in coalescent units. Simulated gene trees can be written in Newick format, and in a way that preserves information about their embedding within the species network. This embedding can be used for downstream purposes, such as to simulate species-specific processes like rate variation across species, or for other scenarios as illustrated in this note. This package should be useful for simulation studies and simulation-based inference methods. The software is available open source with documentation and a tutorial at https://github.com/cecileane/PhyloCoalSimulations.jl.


Subjects
Gene Flow, Software, Phylogeny, Computer Simulation, Probability, Genetic Models
6.
Genome Biol Evol ; 15(1), 2023 Jan 04.
Article in English | MEDLINE | ID: mdl-36582124

ABSTRACT

Mycoheterotrophy is an alternative nutritional strategy whereby plants obtain sugars and other nutrients from soil fungi. Mycoheterotrophy and associated loss of photosynthesis have evolved repeatedly in plants, particularly in monocots. Although reductive evolution of plastomes in mycoheterotrophs is well documented, the dynamics of nuclear genome evolution remains largely unknown. Transcriptome datasets were generated from four mycoheterotrophs in three families (Orchidaceae, Burmanniaceae, Triuridaceae) and related green plants and used for phylogenomic analyses to resolve relationships among the mycoheterotrophs, their relatives, and representatives across the monocots. Phylogenetic trees based on 602 genes were mostly congruent with plastome phylogenies, except for an Asparagales + Liliales clade inferred in the nuclear trees. Reduction and loss of chlorophyll synthesis and photosynthetic gene expression and relaxation of purifying selection on retained genes were progressive, with greater loss in older nonphotosynthetic lineages. One hundred seventy-four of 1375 plant benchmark universally conserved orthologous genes were undetected in any mycoheterotroph transcriptome or the genome of the mycoheterotrophic orchid Gastrodia but were expressed in green relatives, providing evidence for massively convergent gene loss in nonphotosynthetic lineages. We designate this set of deleted or undetected genes Missing in Mycoheterotrophs (MIM). MIM genes encode not only mainly photosynthetic or plastid membrane proteins but also a diverse set of plastid processes, genes of unknown function, mitochondrial, and cellular processes. Transcription of a photosystem II gene (psb29) in all lineages implies a nonphotosynthetic function for this and other genes retained in mycoheterotrophs. Nonphotosynthetic plants enable novel insights into gene function as well as gene expression shifts, gene loss, and convergence in nuclear genomes.


Subjects
Plastid Genomes, Orchidaceae, Humans, Aged, Phylogeny, Plant Genes, Plant Proteins/genetics, Orchidaceae/genetics
7.
J Math Biol ; 86(1): 12, 2022 Dec 08.
Article in English | MEDLINE | ID: mdl-36481927

ABSTRACT

Phylogenetic networks extend phylogenetic trees to model non-vertical inheritance, by which a lineage inherits material from multiple parents. The computational complexity of estimating phylogenetic networks from genome-wide data with likelihood-based methods limits the size of networks that can be handled. Methods based on pairwise distances could offer faster alternatives. We study here the information that average pairwise distances contain on the underlying phylogenetic network, by characterizing local and global features that can or cannot be identified. For general networks, we clarify that the root and edge lengths adjacent to reticulations are not identifiable, and then focus on the class of zipped-up semidirected networks. We provide a criterion to swap subgraphs locally, such as 3-cycles, resulting in indistinguishable networks. We propose the "distance split tree", which can be constructed from pairwise distances, and prove that it is a refinement of the network's tree of blobs, capturing the tree-like features of the network. For level-1 networks, this distance split tree is equal to the tree of blobs refined to separate polytomies from blobs, and we prove that the mixed representation of the network is identifiable. The information loss is localized around 4-cycles, for which the placement of the reticulation is unidentifiable. The mixed representation combines split edges for 4-cycles, regular tree and hybrid edges from the semidirected network, and edge parameters that encode all information identifiable from average pairwise distances.


Subjects
Phylogeny, Likelihood Functions
8.
Front Plant Sci ; 13: 876779, 2022.
Article in English | MEDLINE | ID: mdl-36483967

ABSTRACT

We assess relationships among 192 species in all 12 monocot orders and 72 of 77 families, using 602 conserved single-copy (CSC) genes and 1375 benchmarking single-copy ortholog (BUSCO) genes extracted from genomic and transcriptomic datasets. Phylogenomic inferences based on these data, using both coalescent-based and supermatrix analyses, are largely congruent with the most comprehensive plastome-based analysis, and nuclear-gene phylogenomic analyses with less comprehensive taxon sampling. The strongest discordance between the plastome and nuclear gene analyses is the monophyly of a clade comprising Asparagales and Liliales in our nuclear gene analyses, versus the placement of Asparagales and Liliales as successive sister clades to the commelinids in the plastome tree. Within orders, around six of 72 families shifted positions relative to the recent plastome analysis, but four of these involve poorly supported inferred relationships in the plastome-based tree. In Poales, the nuclear data place a clade comprising Ecdeiocoleaceae+Joinvilleaceae as sister to the grasses (Poaceae); Typhaceae (rather than Bromeliaceae) are resolved as sister to all other Poales. In Commelinales, nuclear data place Philydraceae sister to all other families rather than to a clade comprising Haemodoraceae+Pontederiaceae as seen in the plastome tree. In Liliales, nuclear data place Liliaceae sister to Smilacaceae, and Melanthiaceae are placed sister to all other Liliales except Campynemataceae. Finally, in Alismatales, nuclear data strongly place Tofieldiaceae, rather than Araceae, as sister to all the other families, providing an alternative resolution of what has been the most problematic node to resolve using plastid data, outside of those involving achlorophyllous mycoheterotrophs. As seen in numerous prior studies, the orders Acorales and Alismatales are placed as successive sister lineages to all other extant monocots. Only 21.2% of BUSCO genes were demonstrably single-copy, yet phylogenomic inferences based on BUSCO and CSC genes did not differ, and overall functional annotations of the two sets were very similar. Our analyses also reveal significant gene tree-species tree discordance despite high support values, as expected given incomplete lineage sorting (ILS) related to rapid diversification. Our study advances understanding of monocot relationships and the robustness of phylogenetic inferences based on large numbers of nuclear single-copy genes that can be obtained from transcriptomes and genomes.

9.
Bioinformatics ; 38(11): 3044-3050, 2022 May 26.
Article in English | MEDLINE | ID: mdl-35482481

ABSTRACT

MOTIVATION: Kinship estimation is necessary for evaluating violations of assumptions or testing certain hypotheses in many population genomic studies. However, kinship estimators are usually designed for diploid systems and cannot be used in populations with mixed haploid-diploid genetic systems. The only estimators for different ploidies require datasets free of population structure, limiting their usage. RESULTS: We present KIMGENS (Kinship Inference for Mixed GENetic Systems), an estimator for kinship estimation among individuals of various ploidies, that is robust to population structure. This estimator is based on the popular KING-robust estimator but uses diploid relatives of the individuals of interest as references of heterozygosity and extends its use to haploid-diploid and haploid pairs of individuals. We demonstrate that KIMGENS estimates kinship more accurately than previously developed estimators in simulated panmictic, structured and admixed populations, but has lower accuracy when the individual of interest is inbred. KIMGENS also outperforms other estimators in a honeybee dataset. Therefore, KIMGENS is a valuable addition to a population geneticist's toolbox. AVAILABILITY AND IMPLEMENTATION: KIMGENS and its associated simulation tool are implemented and available open-source at https://github.com/YenWenWang/HapDipKinship. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Diploidy, Software, Humans, Animals, Haploidy, Genomics/methods, Computer Simulation
10.
Proc Natl Acad Sci U S A ; 118(33), 2021 Aug 17.
Article in English | MEDLINE | ID: mdl-34373325

ABSTRACT

Carnivorous plants consume animals for mineral nutrients that enhance growth and reproduction in nutrient-poor environments. Here, we report that Triantha occidentalis (Tofieldiaceae) represents a previously overlooked carnivorous lineage that captures insects on sticky inflorescences. Field experiments, isotopic data, and mixing models demonstrate significant N transfer from prey to Triantha, with an estimated 64% of leaf N obtained from prey capture in previous years, comparable to levels inferred for the cooccurring round-leaved sundew, a recognized carnivore. N obtained via carnivory is exported from the inflorescence and developing fruits and may ultimately be transferred to next year's leaves. Glandular hairs on flowering stems secrete phosphatase, as seen in all carnivorous plants that directly digest prey. Triantha is unique among carnivorous plants in capturing prey solely with sticky traps adjacent to its flowers, contrary to theory. However, its glandular hairs capture only small insects, unlike the large bees and butterflies that act as pollinators, which may minimize the conflict between carnivory and pollination.


Subjects
Alismatales/physiology, Carnivorous Plant/physiology, Inflorescence/physiology, Nitrogen Isotopes/metabolism, Animals, Drosophila/chemistry, Ecosystem, Nitrogen/metabolism, Nitrogen Isotopes/chemistry
11.
Bioinformatics ; 37(5): 634-641, 2021 May 05.
Article in English | MEDLINE | ID: mdl-33027508

ABSTRACT

MOTIVATION: With growing genome-wide molecular datasets from next-generation sequencing, phylogenetic networks can be estimated using a variety of approaches. These phylogenetic networks include events like hybridization, gene flow or horizontal gene transfer explicitly. However, the most accurate network inference methods are computationally heavy. Methods that scale to larger datasets do not calculate a full likelihood, such that traditional likelihood-based tools for model selection are not applicable to decide how many past hybridization events best fit the data. We propose here a goodness-of-fit test to quantify the fit between data observed from genome-wide multi-locus data, and patterns expected under the multi-species coalescent model on a candidate phylogenetic network. RESULTS: We identified weaknesses in the previously proposed TICR test, and proposed corrections. The performance of our new test was validated by simulations on real-world phylogenetic networks. Our test provides one of the first rigorous tools for model selection, to select the adequate network complexity for the data at hand. The test can also work for identifying poorly inferred areas on a network. AVAILABILITY AND IMPLEMENTATION: Software for the goodness-of-fit test is available as a Julia package at https://github.com/cecileane/QuartetNetworkGoodnessFit.jl. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Genome, Software, High-Throughput Nucleotide Sequencing, Likelihood Functions, Phylogeny
12.
Syst Biol ; 69(3): 593-601, 2020 May 01.
Article in English | MEDLINE | ID: mdl-31432090

ABSTRACT

Genomic data have had a profound impact on nearly every biological discipline. In systematics and phylogenetics, the thousands of loci that are now being sequenced can be analyzed under the multispecies coalescent model (MSC) to explicitly account for gene tree discordance due to incomplete lineage sorting (ILS). However, the MSC assumes no gene flow post divergence, calling for additional methods that can accommodate this limitation. Explicit phylogenetic network methods have emerged, which can simultaneously account for ILS and gene flow by representing evolutionary history as a directed acyclic graph. In this point of view, we highlight some of the strengths and limitations of phylogenetic networks and argue that tree-based inference should not be blindly abandoned in favor of networks simply because they represent more parameter rich models. Attention should be given to model selection of reticulation complexity, and the most robust conclusions regarding evolutionary history are likely obtained when combining tree- and network-based inference.


Subjects
Classification/methods, Genome/genetics, Phylogeny
13.
Syst Biol ; 69(3): 462-478, 2020 May 01.
Article in English | MEDLINE | ID: mdl-31693158

ABSTRACT

Baobabs (Adansonia) are a cohesive group of tropical trees with a disjunct distribution in Australia, Madagascar, and continental Africa, and diverse flowers associated with two pollination modes. We used custom-targeted sequence capture in conjunction with new and existing phylogenetic comparative methods to explore the evolution of floral traits and pollination systems while allowing for reticulate evolution. Our analyses suggest that relationships in Adansonia are confounded by reticulation, with network inference methods supporting at least one reticulation event. The best supported hypothesis involves introgression between Adansonia rubrostipa and core Longitubae, both of which are hawkmoth pollinated with yellow/red flowers, but there is also some support for introgression between the African lineage and Malagasy Brevitubae, which are both mammal-pollinated with white flowers. New comparative methods for phylogenetic networks were developed that allow maximum-likelihood inference of ancestral states and were applied to study the apparent homoplasy in floral biology and pollination mode seen in Adansonia. This analysis supports a role for introgressive hybridization in morphological evolution even in a clade with highly divergent and geographically widespread species. Our new comparative methods for discrete traits on species networks are implemented in the software PhyloNetworks. [Comparative methods; Hyb-Seq; introgression; network inference; population trees; reticulate evolution; species tree inference; targeted sequence capture.].


Subjects
Adansonia/anatomy & histology, Adansonia/classification, Biological Evolution, Flowers/anatomy & histology, Pollination/physiology, Adansonia/genetics, Flowers/genetics, Species Specificity
14.
PLoS One ; 14(5): e0217890, 2019.
Article in English | MEDLINE | ID: mdl-31145764

ABSTRACT

[This corrects the article DOI: 10.1371/journal.pone.0076267.].

15.
J Integr Plant Biol ; 61(1): 12-31, 2019 Jan.
Article in English | MEDLINE | ID: mdl-30474311

ABSTRACT

Previous research suggests that Gossypium has undergone a 5- to 6-fold multiplication following its divergence from Theobroma. However, the number of events, or where they occurred in the Malvaceae phylogeny remains unknown. We analyzed transcriptomic and genomic data from representatives of eight of the nine Malvaceae subfamilies. Phylogenetic analysis of nuclear data placed Dombeya (Dombeyoideae) as sister to the rest of Malvadendrina clade, but the plastid DNA tree strongly supported Durio (Helicteroideae) in this position. Intraspecific Ks plots indicated that all sampled taxa, except Theobroma (Byttnerioideae), Corchorus (Grewioideae), and Dombeya (Dombeyoideae), have experienced whole genome multiplications (WGMs). Quartet analysis suggested WGMs were shared by Malvoideae-Bombacoideae and Sterculioideae-Tilioideae, but did not resolve whether these are shared with each other or Helicteroideae (Durio). Gene tree reconciliation and Bayesian concordance analysis suggested a complex history. Alternative hypotheses are suggested, each involving two independent autotetraploid and one allopolyploid event. They differ in that one entails an allopolyploid origin for the Durio lineage, whereas the other invokes an allopolyploid origin for Malvoideae-Bombacoideae. We highlight the need for more genomic information in the Malvaceae and improved methods to resolve complex evolutionary histories that may include allopolyploidy, incomplete lineage sorting, and variable rates of gene and genome evolution.


Subjects
Plant Genome/genetics, Malvaceae/genetics, Bayes Theorem, Genomics, Gossypium/genetics, Phylogeny
16.
Am J Bot ; 105(11): 1888-1910, 2018 Nov.
Article in English | MEDLINE | ID: mdl-30368769

ABSTRACT

PREMISE OF THE STUDY: We present the first plastome phylogeny encompassing all 77 monocot families, estimate branch support, and infer monocot-wide divergence times and rates of species diversification. METHODS: We conducted maximum likelihood analyses of phylogeny and BAMM studies of diversification rates based on 77 plastid genes across 545 monocots and 22 outgroups. We quantified how branch support and ascertainment vary with gene number, branch length, and branch depth. KEY RESULTS: Phylogenomic analyses shift the placement of 16 families in relation to earlier studies based on four plastid genes, add seven families, date the divergence between monocots and eudicots+Ceratophyllum at 136 Mya, successfully place all mycoheterotrophic taxa examined, and support recognizing Taccaceae and Thismiaceae as separate families and Arecales and Dasypogonales as separate orders. Only 45% of interfamilial divergences occurred after the Cretaceous. Net species diversification underwent four large-scale accelerations in PACMAD-BOP Poaceae, Asparagales sister to Doryanthaceae, Orchidoideae-Epidendroideae, and Araceae sister to Lemnoideae, each associated with specific ecological/morphological shifts. Branch ascertainment and support across monocots increase with gene number and branch length, and decrease with relative branch depth. Analysis of entire plastomes in Zingiberales quantifies the importance of non-coding regions in identifying and supporting short, deep branches. CONCLUSIONS: We provide the first resolved, well-supported monocot phylogeny and timeline spanning all families, and quantify the significant contribution of plastome-scale data to resolving short, deep branches. We outline a new functional model for the evolution of monocots and their diagnostic morphological traits from submersed aquatic ancestors, supported by convergent evolution of many of these traits in aquatic Hydatellaceae (Nymphaeales).


Subjects
Genetic Speciation, Plastid Genomes, Magnoliopsida/genetics, Phylogeny, Intergenic DNA, Zingiberales/genetics
17.
Syst Biol ; 67(5): 800-820, 2018 Sep 01.
Article in English | MEDLINE | ID: mdl-29701821

ABSTRACT

The goal of phylogenetic comparative methods (PCMs) is to study the distribution of quantitative traits among related species. The observed traits are often seen as the result of a Brownian Motion (BM) along the branches of a phylogenetic tree. Reticulation events such as hybridization, gene flow or horizontal gene transfer, can substantially affect a species' traits, but are not modeled by a tree. Phylogenetic networks have been designed to represent reticulate evolution. As they become available for downstream analyses, new models of trait evolution are needed, applicable to networks. We develop here an efficient recursive algorithm to compute the phylogenetic variance matrix of a trait on a network, in only one preorder traversal of the network. We then extend the standard PCM tools to this new framework, including phylogenetic regression with covariates (or phylogenetic ANOVA), ancestral trait reconstruction, and Pagel's $\lambda$ test of phylogenetic signal. The trait of a hybrid is sometimes outside of the range of its two parents, for instance because of hybrid vigor or hybrid depression. These two phenomena are rather commonly observed in present-day hybrids. Transgressive evolution can be modeled as a shift in the trait value following a reticulation point. We develop a general framework to handle such shifts and take advantage of the phylogenetic regression view of the problem to design statistical tests for ancestral transgressive evolution in the evolutionary history of a group of species. We study the power of these tests in several scenarios and show that recent events have indeed the strongest impact on the trait distribution of present-day taxa. We apply those methods to a data set of Xiphophorus fishes, to confirm and complete previous analysis in this group. All the methods developed here are available in the Julia package PhyloNetworks.


Subjects
Cyprinodontiformes/genetics, Molecular Evolution, Gene Flow, Horizontal Gene Transfer, Genetic Hybridization, Phylogeny, Algorithms, Animals, Cyprinodontiformes/classification, Genetic Models, Phenotype
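
Illustrative note on record 17 (not part of the database entry): the abstract describes computing the phylogenetic variance matrix of a Brownian motion trait in one preorder traversal of a network. A minimal sketch of that idea follows, assuming that a hybrid node's trait is the weighted average of its parents' contributions, with inheritance weights (gamma) summing to 1, and that there is no transgressive shift. The input format and function name are hypothetical; this is not the PhyloNetworks implementation.

```python
def network_variance_matrix(nodes, sigma2=1.0):
    """Variance matrix of a Brownian motion trait on a phylogenetic network.

    nodes: list in preorder (every parent listed before its children).
           Each item is (name, parents), where parents is a list of
           (parent_name, edge_length, gamma). The root has parents == [];
           a tree node has one parent with gamma == 1.0; a hybrid node has
           several parents whose gammas sum to 1.
    Returns (index, V): index maps node name to row, V is a list of lists.
    Hypothetical input format, chosen only for this sketch.
    """
    n = len(nodes)
    index = {}
    V = [[0.0] * n for _ in range(n)]
    for i, (name, parents) in enumerate(nodes):
        index[name] = i
        # Covariance with each earlier node: weighted average over parents.
        for j in range(i):
            cov = sum(g * V[index[p]][j] for p, _, g in parents)
            V[i][j] = V[j][i] = cov
        # Variance: each parent's variance plus the edge contribution,
        # plus cross terms between pairs of parents (hybrid nodes only).
        var = sum(g * g * (V[index[p]][index[p]] + sigma2 * length)
                  for p, length, g in parents)
        for a in range(len(parents)):
            for b in range(a + 1, len(parents)):
                pa, _, ga = parents[a]
                pb, _, gb = parents[b]
                var += 2.0 * ga * gb * V[index[pa]][index[pb]]
        V[i][i] = var
    return index, V


if __name__ == "__main__":
    toy = [
        ("root", []),
        ("a", [("root", 1.0, 1.0)]),
        ("b", [("root", 1.0, 1.0)]),
        ("h", [("a", 0.5, 0.5), ("b", 0.5, 0.5)]),  # hybrid of a and b
    ]
    idx, V = network_variance_matrix(toy)
    print(V[idx["h"]][idx["h"]], V[idx["h"]][idx["a"]])
```

Each node is visited once and reuses only rows already filled for its parents, which is what makes a single preorder pass sufficient. The rows and columns for the tips can then be extracted for regression or ancestral state reconstruction.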
18.
Syst Biol ; 67(4): 662-680, 2018 Jul 01.
Article in English | MEDLINE | ID: mdl-29385556

ABSTRACT

To study the evolution of several quantitative traits, the classical phylogenetic comparative framework consists of a multivariate random process running along the branches of a phylogenetic tree. The Ornstein-Uhlenbeck (OU) process is sometimes preferred to the simple Brownian motion (BM) as it models stabilizing selection toward an optimum. The optimum for each trait is likely to be changing over the long periods of time spanned by large modern phylogenies. Our goal is to automatically detect the position of these shifts on a phylogenetic tree, while accounting for correlations between traits, which might exist because of structural or evolutionary constraints. We show that, in the presence of shifts, phylogenetic Principal Component Analysis fails to decorrelate traits efficiently, so that any method aiming at finding shifts needs to deal with correlation simultaneously. We introduce here a simplification of the full multivariate OU model, named scalar OU, which allows for noncausal correlations and is still computationally tractable. We extend the equivalence between the OU and a BM on a rescaled tree to our multivariate framework. We describe an Expectation-Maximization (EM) algorithm that allows for a maximum likelihood estimation of the shift positions, associated with a new model selection criterion, accounting for the identifiability issues for the shift localization on the tree. The method, freely available as an R-package (PhylogeneticEM) is fast, and can deal with missing values. We demonstrate its efficiency and accuracy compared to another state-of-the-art method ($\ell$1ou) on a wide range of simulated scenarios and use this new framework to reanalyze recently gathered data sets on New World Monkeys and Anolis lizards.


Subjects
Biological Adaptation, Biological Evolution, Lizards, Phenotype, Platyrrhini, Algorithms, Animals, Phylogeny
19.
Mol Biol Evol ; 34(12): 3292-3298, 2017 Dec 01.
Article in English | MEDLINE | ID: mdl-28961984

ABSTRACT

PhyloNetworks is a Julia package for the inference, manipulation, visualization, and use of phylogenetic networks in an interactive environment. Inference of phylogenetic networks is done with maximum pseudolikelihood from gene trees or multi-locus sequences (SNaQ), with possible bootstrap analysis. PhyloNetworks is the first software providing tools to summarize a set of networks (from a bootstrap or posterior sample) with measures of tree edge support, hybrid edge support, and hybrid node support. Networks can be used for phylogenetic comparative analysis of continuous traits, to estimate ancestral states or do a phylogenetic regression. The software is available in open source and with documentation at https://github.com/crsl4/PhyloNetworks.jl.


Subjects
Computational Biology/methods, Phylogeny, Algorithms, Molecular Evolution, Software
20.
J Math Biol ; 74(1-2): 355-385, 2017 Jan.
Article in English | MEDLINE | ID: mdl-27241727

ABSTRACT

Diffusion processes on trees are commonly used in evolutionary biology to model the joint distribution of continuous traits, such as body mass, across species. Estimating the parameters of such processes from tip values presents challenges because of the intrinsic correlation between the observations produced by the shared evolutionary history, thus violating the standard independence assumption of large-sample theory. For instance (Ho and Ané, Ann Stat 41:957-981, 2013) recently proved that the mean (also known in this context as selection optimum) of an Ornstein-Uhlenbeck process on a tree cannot be estimated consistently from an increasing number of tip observations if the tree height is bounded. Here, using a fruitful connection to the so-called reconstruction problem in probability theory, we study the convergence rate of parameter estimation in the unbounded height case. For the mean of the process, we provide a necessary and sufficient condition for the consistency of the maximum likelihood estimator (MLE) and establish a phase transition on its convergence rate in terms of the growth of the tree. In particular we show that a loss of $\sqrt{n}$-consistency (i.e., the variance of the MLE becomes $\Omega(n^{-1})$, where n is the number of tips) occurs when the tree growth is larger than a threshold related to the phase transition of the reconstruction problem. For the covariance parameters, we give a novel, efficient estimation method which achieves $\sqrt{n}$-consistency under natural assumptions on the tree. Our theoretical results provide practical suggestions for the design of comparative data collection.


Subjects
Biological Models, Phylogeny, Phenotype, Probability